independent model
- North America > United States > California > Santa Cruz County > Santa Cruz (0.04)
- North America > Canada (0.04)
- Asia > Middle East > Jordan (0.04)
- Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (0.94)
- Health & Medicine > Therapeutic Area > Neurology (0.69)
- Europe > Italy (0.04)
- South America > Colombia (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- North America > United States > California > Santa Cruz County > Santa Cruz (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Asia > Middle East > Jordan (0.04)
- Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (0.94)
- Health & Medicine > Therapeutic Area > Neurology (0.69)
A Implementation Details
For all experimental results on ResNet-29 v2 (He et al., 2016b), we use a fixed batch size, and the network is trained with the Adam optimizer (Kingma & Ba, 2015) for 200 epochs. We randomly split the training dataset into 45,000 training images and 5,000 validation images. We train a Wide ResNet-28-10 v2 (Zagoruyko & Komodakis, 2016) to obtain state-of-the-art accuracy on CIFAR-10 (e.g., Table 2 in the main text). For mixup (Zhang et al., 2018; Thulasidasan et al., 2019), a mixing parameter controls the interpolation of the two images. For CCAT (Stutz et al., 2020), we train models with norm-bounded adversarial examples. We likewise train a Wide ResNet-28-10 v2 (Zagoruyko & Komodakis, 2016) to obtain state-of-the-art accuracy on CIFAR-100. All experiments on ImageNet were obtained by training a ResNet-101 v1 (He et al., 2016a) following a standard training script. The input image is normalized (divided by 255) to lie within [0, 1].
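The data split and input normalization described above can be sketched as follows. This is an illustrative sketch only: the toy 8x8 image shape and the random seed are assumptions for compactness (real CIFAR images are 32x32), while the 45,000/5,000 split and the divide-by-255 normalization come from the text.

```python
import numpy as np

# Stand-in for a CIFAR-style training set of 50,000 RGB images
# (toy 8x8 resolution here to keep the example light).
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(50_000, 8, 8, 3), dtype=np.uint8)

# Randomly split into 45,000 training and 5,000 validation images.
perm = rng.permutation(len(images))
train_idx, val_idx = perm[:45_000], perm[45_000:]
train, val = images[train_idx], images[val_idx]

# Normalize inputs to [0, 1] by dividing by 255, as in the text.
train = train.astype(np.float32) / 255.0
val = val.astype(np.float32) / 255.0
```

The permutation-then-slice pattern guarantees the two subsets are disjoint and together cover the full training set.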
Model Recycling Framework for Multi-Source Data-Free Supervised Transfer Learning
This situation can give rise to privacy concerns, as organizations may not want to share sensitive information; for instance, healthcare providers may be reluctant to share patient information, and security system maintainers may not want to risk sharing facial recognition data for system performance updates. There may also be practical obstacles to obtaining the source data, such as technical difficulties in retrieval or intellectual property restrictions (Li et al., 2020b; Chen et al., 2021; Liang et al., 2020; Ahmed et al., 2021b). Recent advances in source-free unsupervised domain adaptation (SFUDA) address the scenario where source data is not accessible (Fang et al., 2022): SFUDA utilizes pre-trained source models to improve the generalization of a model on an unlabeled target dataset. Our work is similar to other SFUDA approaches (Li et al., 2020b; Chen et al., 2021; Liang et al., 2020; Ahmed et al., 2021b) in that it addresses the practical scenario where source data is unavailable during training. However, a crucial aspect is overlooked by the majority of SFUDA studies: when source data is assumed inaccessible, it cannot be guaranteed that the available source models were trained on domains related to the target task. Yet most works have only experimented on classic domain adaptation benchmarks, which are related by design, e.g., Digits-Five (Peng et al., 2019), Office-31 (Saenko et al., 2010), and Office-Home (Venkateswara et al., 2017), i.e., domains that share the same labels but are dissimilar in feature (and ambient) space. Our approach is unique in that we consider a source-free supervised transfer learning (SFSTL) setting (Lee et al., 2019), where we do not assume source models are trained on tasks with similar feature or label spaces.
- Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
- Europe > Greece (0.04)
- Europe > Czechia > Prague (0.04)
- Asia > Middle East > Jordan (0.04)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (1.00)
- Transportation > Ground > Road (0.46)
- Information Technology > Security & Privacy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.71)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Towards Foundation Models for Experimental Readout Systems Combining Discrete and Continuous Data
Giroux, James, Fanelli, Cristiano
We present a (proto) Foundation Model for Nuclear Physics, capable of operating on low-level detector inputs from Imaging Cherenkov Detectors at the future Electron Ion Collider. Building upon established next-token prediction approaches, we aim to address potential challenges such as resolution loss from existing tokenization schemes and limited support for conditional generation. We propose four key innovations: (i) separate vocabularies for discrete and continuous variates, combined via Causal Multi-Head Cross-Attention (CMHCA), (ii) continuous kinematic conditioning through prepended context embeddings, (iii) simple, scalable, high-resolution continuous-variate tokenization without joint vocabulary inflation, and (iv) class-conditional generation through a Mixture of Experts. Our model enables fast, high-fidelity generation of pixel and time sequences for Cherenkov photons, validated through closure tests in the High Performance DIRC. We also show that our model generalizes to reconstruction tasks such as pion/kaon identification and noise filtering, for which we demonstrate its ability to leverage fine-tuning under specific objectives.
- North America > Mexico > Gulf of Mexico (0.14)
- North America > United States (0.04)
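One way to realize a separate high-resolution vocabulary for continuous variates, as in innovation (iii) of the abstract above, is uniform binning into a dedicated token space. The sketch below is an assumption about how such a tokenizer could look, not code from the paper; the value range and bin count are illustrative.

```python
import numpy as np

def tokenize_continuous(values, lo, hi, n_bins):
    """Map continuous variates to token ids in a dedicated vocabulary.

    Keeping a separate vocabulary for continuous variates means the
    discrete vocabulary (e.g. pixel ids) and the continuous one
    (n_bins ids) add rather than multiply, avoiding joint-vocabulary
    inflation while resolution scales linearly with n_bins.
    """
    values = np.clip(values, lo, hi)
    # Uniform bins over [lo, hi]; ids lie in [0, n_bins - 1].
    ids = np.floor((values - lo) / (hi - lo) * n_bins).astype(int)
    return np.minimum(ids, n_bins - 1)

def detokenize(ids, lo, hi, n_bins):
    """Invert tokenization to bin centers; resolution = (hi - lo) / n_bins."""
    return lo + (ids + 0.5) * (hi - lo) / n_bins

# Example: hypothetical hit times (ns) tokenized with 1000 bins.
times = np.array([0.0, 12.71, 49.99])
ids = tokenize_continuous(times, lo=0.0, hi=50.0, n_bins=1000)
```

Round-tripping through `detokenize` recovers each value to within half a bin width, which is the resolution/vocabulary-size trade-off the abstract refers to.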
Neurosymbolic Reasoning Shortcuts under the Independence Assumption
van Krieken, Emile, Minervini, Pasquale, Ponti, Edoardo, Vergari, Antonio
The ubiquitous independence assumption among symbolic concepts in neurosymbolic (NeSy) predictors is a convenient simplification: NeSy predictors use it to speed up probabilistic reasoning. Recent works like van Krieken et al. (2024) and Marconato et al. (2024) argued that the independence assumption can hinder learning of NeSy predictors and, more crucially, prevent them from correctly modelling uncertainty. There is, however, scepticism in the NeSy community around the scenarios in which the independence assumption actually limits NeSy systems (Faronius and Dos Martires, 2025). In this work, we settle this question by formally showing that assuming independence among symbolic concepts entails that a model can never represent uncertainty over certain concept combinations. Thus, the model fails to be aware of reasoning shortcuts, i.e., the pathological behaviour of NeSy predictors that produce correct predictions on downstream tasks but for the wrong reasons.
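The core impossibility the abstract refers to can be seen with two binary concepts. The following toy numeric check is our own illustration, not code from the paper: no independent distribution can concentrate mass on only the "matched" combinations (0,0) and (1,1).

```python
import itertools

def joint_from_marginals(p_a, p_b):
    """Joint over (A, B) under the independence assumption p(A,B) = p(A)p(B)."""
    return {(a, b): (p_a if a else 1 - p_a) * (p_b if b else 1 - p_b)
            for a, b in itertools.product([0, 1], repeat=2)}

# Desired uncertainty: the predictor should be unsure only between the
# concept combinations (0, 0) and (1, 1), i.e. p(0,0) = p(1,1) = 0.5.
# That target forces the marginals p(A=1) = p(B=1) = 0.5, but the
# independent model with those marginals spreads mass uniformly,
# putting probability 0.25 on the impossible combinations (0, 1)
# and (1, 0) as well:
joint = joint_from_marginals(0.5, 0.5)
```

Since any independent joint matching the required marginals leaks mass onto (0,1) and (1,0), the target distribution is unrepresentable, which is exactly the kind of uncertainty over concept combinations the paper shows independence rules out.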
DuFFin: A Dual-Level Fingerprinting Framework for LLMs IP Protection
Yan, Yuliang, Tang, Haochun, Yan, Shuo, Dai, Enyan
Large language models (LLMs) are considered valuable intellectual property (IP) of their legitimate owners due to the enormous computational cost of training. It is crucial to protect the IP of LLMs from malicious stealing or unauthorized deployment. Despite existing efforts in watermarking and fingerprinting LLMs, these methods either impact the text generation process or require white-box access to the suspect model, making them impractical. Hence, we propose DuFFin, a novel $\textbf{Du}$al-Level $\textbf{Fin}$gerprinting $\textbf{F}$ramework for ownership verification in the black-box setting. DuFFin extracts trigger-pattern and knowledge-level fingerprints to identify the source of a suspect model. We conduct experiments on a variety of models collected from open-source model hubs, including four popular base models as protected LLMs and their fine-tuned, quantized, and safety-aligned versions, released by large companies, start-ups, and individual users. Results show that our method can accurately verify the copyright of a protected base LLM on its model variants, achieving an IP-ROC metric greater than 0.95. Our code is available at https://github.com/yuliangyan0807/llm-fingerprint.
- Information Technology > Security & Privacy (1.00)
- Health & Medicine > Therapeutic Area > Genetic Disease (0.68)
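To make the black-box setting above concrete, here is a deliberately simplified sketch of fingerprint-based ownership verification. This is not DuFFin's actual algorithm; the probe questions, dictionary-backed "models", and the agreement threshold are all hypothetical, chosen only to show why querying outputs suffices without white-box access to weights.

```python
def fingerprint(model, probes):
    """Black-box fingerprint: the model's answers to fixed probe queries.

    `model` is any callable text -> text; only its outputs are used,
    so no access to the suspect model's weights is required.
    """
    return tuple(model(p) for p in probes)

def same_lineage(protected, suspect, probes, threshold=0.8):
    """Declare shared provenance if enough probe answers agree."""
    fp_a = fingerprint(protected, probes)
    fp_b = fingerprint(suspect, probes)
    agreement = sum(a == b for a, b in zip(fp_a, fp_b)) / len(probes)
    return agreement >= threshold

# Hypothetical models: a fine-tuned variant mostly preserves the base
# model's knowledge-level answers, while an unrelated model does not.
probes = ["capital of France?", "2+2?", "author of Hamlet?"]
base = {"capital of France?": "Paris", "2+2?": "4", "author of Hamlet?": "Shakespeare"}.get
variant = {"capital of France?": "Paris", "2+2?": "4", "author of Hamlet?": "Shakespeare"}.get
unrelated = {"capital of France?": "Lyon", "2+2?": "5", "author of Hamlet?": "Marlowe"}.get
```

The real method's knowledge-level fingerprints are far more robust than exact string matching, but the verification interface is the same: compare behaviour on probes, decide provenance from agreement.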
Privacy-Aware Lifelong Learning
Özdenizci, Ozan, Rueckert, Elmar, Legenstein, Robert
Lifelong learning algorithms enable models to incrementally acquire new knowledge without forgetting previously learned information. Conversely, the field of machine unlearning focuses on explicitly forgetting certain previous knowledge from pretrained models when requested, in order to comply with data privacy regulations on the right to be forgotten. Enabling efficient lifelong learning with the capability to selectively unlearn sensitive information presents a critical and largely unaddressed challenge with contradicting objectives. We address this problem from the perspective of simultaneously preventing catastrophic forgetting and allowing forward knowledge transfer during task-incremental learning, while ensuring exact task unlearning and minimizing memory requirements, based on a single neural network model to be adapted. Our proposed solution, privacy-aware lifelong learning (PALL), involves optimization of task-specific sparse subnetworks with parameter sharing within a single architecture. We additionally utilize an episodic memory rehearsal mechanism to facilitate exact unlearning without performance degradation. We empirically demonstrate the scalability of PALL across various architectures in image classification, and provide a state-of-the-art solution that uniquely integrates lifelong learning and privacy-aware unlearning mechanisms for responsible AI applications.

Lifelong learning algorithms enhance the ability of machine learning models to incrementally acquire new skills or integrate new knowledge over time from sequentially observed data (van de Ven et al., 2022). This continual learning capability is essential for models to stay relevant in dynamic environments where the observed data distributions change. A widely studied challenge in this setting is mitigating catastrophic forgetting, i.e., the loss of prior knowledge as new tasks are learned. Various strategies have been proposed to prevent forgetting while exploiting forward knowledge transfer to efficiently improve performance on new tasks.
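The combination of task-specific sparse subnetworks, parameter sharing, and exact unlearning described above can be sketched schematically with binary masks over a single parameter vector. This is our own toy illustration of the general mask-based idea, not PALL's training procedure; the mask patterns and weight values are arbitrary assumptions.

```python
import numpy as np

params = np.arange(1.0, 9.0)  # one shared parameter vector (8 weights)
masks = {                      # task-specific sparse subnetworks
    "task1": np.array([1, 1, 1, 0, 0, 0, 0, 0], dtype=bool),
    "task2": np.array([0, 0, 1, 1, 1, 0, 0, 0], dtype=bool),  # shares weight 2
}

def predict_mask(task):
    """Each task only ever reads its own subnetwork (mask * params)."""
    return params * masks[task]

def unlearn(task):
    """Exact unlearning: reset parameters used exclusively by `task`.

    Shared parameters are left untouched, so the remaining tasks'
    subnetworks compute exactly what they did before; in PALL an
    episodic rehearsal memory then guards against any residual
    performance loss.
    """
    others = np.zeros_like(masks[task])
    for t, m in masks.items():
        if t != task:
            others |= m
    exclusive = masks[task] & ~others
    params[exclusive] = 0.0
    del masks[task]

unlearn("task1")  # weights 0 and 1 are exclusive to task1 and get reset
```

Because task2's subnetwork never read the reset weights, its outputs are bit-identical after unlearning, which is the sense in which the removal is "exact" rather than approximate.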